Toward a Framework for the Large Scale Textual and Contextual Analysis of Government Information Declassification Patterns

Authors

  • Rachel Shorey
  • Hanna Wallach
Abstract

The US government protects a massive amount of secret data as part of its Security Classification System. This information is expensive to protect and maintain. To keep citizens informed and to keep costs down, the government continually releases newly declassified documents to the public. According to OpenTheGovernment.org’s annual Secrecy Report Card, human readers manually declassified almost 29 million pages of information in 2009 alone (McDermott and Bennett, 2010). Scholars interested in the history and policy of government transparency face a daunting task in examining even a small portion of these documents. To make it easier to learn about the content of these documents, we investigate the documents available through Gale’s Declassified Documents Reference System, an electronic repository of once-classified documents created throughout the 20th century, along two dimensions. First, we perform survival analysis to consider questions relating to time, such as when documents were created and how long they tend to remain classified. Then, we examine the contents of the documents using statistical topic models. Finally, we combine temporal and content information, both by using the output of a topic model to inform a proportional hazards survival model and by considering time information during inference in a topic model. In this paper, we present a range of results that arise from a combined analysis of the temporal features and textual content of declassified documents. Because the results of statistical topic models tend to be easily interpretable by humans, these new ways of looking at data relating to declassified government documents will likely be useful to experts in fields relating to government policy on secrecy.

Similar resources

Towards the Development of a Socially-Informed, Process-Oriented Model of Research in Metadiscourse

Since the early development of interest in the interpersonal dimensions of academic communication in the 1980s, the analytic potentials of the concept of metadiscourse have motivated a large number of investigations. Although these analytic potentials have facilitated the study of diverse academic genres, there has always been a risk of detachment of textual analyses from the contextual origins...

TEXTUAL AND INTER-TEXTUAL ANALYSES OF IRANIAN EFL UNDERGRADUATES’ TYPES OF ENGLISH READING TOWARDS DEVELOPING A CAREFUL READING FRAMEWORK

This study investigated textual and inter-textual reading of a group of Iranian EFL undergraduates’ careful English reading types. In this research, Khalifa and Weir’s (2009) reading framework was used to propose a more inclusive aspect of a careful reading framework and the reading construct for instructional and assessment goals. The participants of this study were B.A. students of English Tr...

Proposing a Conceptual Framework for the Knowledge Architecture of Large-Scale Organizations

The main concern for most organizations in this age, which has been called the age of knowledge-based economy, is their success and superiority in competitive markets. Reviewing the parameters that might have been effective on the success of operationalizing a knowledge management project leads us to a potential factor as knowledge architecture. With regard to the significant effect of knowledg...

Electronic Government

Electronic government is an offshoot of information technology and the management of this technology. If implemented in the state-owned sector, e-government will effect tremendous changes in the efficiency and effectiveness of government systems and services, leading to large-scale popular satisfaction. While elaborating on this modern global concept, the pre...

A Comparative Analysis of the Effect of Visual and Textual Information on the Health Information Perception of High School Girl Students in Tehran

Purpose: Information and information sources can be divided into three broad categories according to their nature or type: textual information (book, journal article, conference paper, dissertation, newspaper, etc.), visual information (infographic, photo, cartoon, film, etc.) and audiovisual information. The purpose of this study is to determine the effect of reading textual information in c...

Publication date: 2011